Human-centric Video Understanding with Weak Supervision: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Author

  • Vignesh Ramanathan
Abstract

A large fraction of videos, such as entertainment, sports, and surveillance videos, are centered around people. We need efficient ways to index such content, i.e., to understand and describe people: Who are they? What are their roles? What are their actions and intentions? One major challenge is that training computer vision models for these tasks typically requires extensive spatial and temporal annotations. Such annotations are often very expensive and difficult to collect at the scale of thousands of videos. We could handle this problem by learning from weakly labeled videos, which are readily available and cheaper to collect. However, in such videos the person labels are not spatially or temporally localized. In this thesis, we present models that learn from weakly labeled videos by automatically aligning the labels with the right people in the video to identify their (i) names, (ii) social roles, and (iii) actions.

In the first part of this thesis, we consider the problem of identifying the names of people in weakly labeled videos. In particular, we deal with one widely available source of weakly labeled videos: TV episodes. These videos are accompanied only by TV scripts, which provide a noisy description of the characters appearing in different parts of the episodes. The descriptions are often not well aligned with the video, making the task more challenging. Further, people in the script are mentioned not only by name but also by pronouns such as “he” and “she” and by nominals such as “doctor” and “teacher”. This adds to the ambiguity in aligning human mentions in the script with their actual appearance in the video. We address these problems by proposing a joint optimization framework for resolving name references in the text (coreference resolution) and name assignments in video. This joint model leads to better performance in both tasks and is evaluated on a dataset of 19 TV episodes.

Similar resources

Supporting Effective Interaction with Tabletop Groupware: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Gaze-enhanced User Interface Design: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Structuring Peer Interactions for Massive Scale Learning: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Haptics and Physical Simulation for Virtual Bone Surgery: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Publication date: 2016